Skip to content

tabletserver: improved state transitions with new stateManager#6396

Merged
sougou merged 19 commits intovitessio:masterfrom
planetscale:ss-ts1-state
Jul 14, 2020
Merged

tabletserver: improved state transitions with new stateManager#6396
sougou merged 19 commits intovitessio:masterfrom
planetscale:ss-ts1-state

Conversation

@sougou
Copy link
Copy Markdown
Contributor

@sougou sougou commented Jun 30, 2020

The tabletserver can now remember its desired state and continuously retry if the first attempt fails. This responsibility was previously with tabletmanager. This change allows us to delete the corresponding code from tabletmanager.

Most sub-component function calls have been normalized to simplify state management. All such functions are now idempotent.

The new stateManager invokes the necessary functions to reach any required state knowing that the functions are idempotent. This allows it to act in a manner that is agnostic of what the current state is.

Strict ordering rules have been defined for how to open and close various services.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vterrors FTW

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"golang.org/x/net/context" -> "context"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be really nice if states are not just plain int64, but their own type, built on int64. It helps understanding and gives compiler protection for a bunch of situations.

Copy link
Copy Markdown
Collaborator

@systay systay Jun 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super-nit: I think it would be nice if we changed the order of these, so that order comes before component. That way, assertions line up better:

verifySubcomponent(t, 1, sm.messager,  testStateClosed)
verifySubcomponent(t, 2, sm.te,  testStateClosed)
verifySubcomponent(t, 3, sm.txThrottler,  testStateClosed)
verifySubcomponent(t, 4, sm.qe,  testStateClosed)
verifySubcomponent(t, 5, sm.watcher,  testStateClosed)
...

@sougou sougou force-pushed the ss-ts1-state branch 2 times, most recently from c42ab86 to c6d295f Compare July 3, 2020 01:06
sougou added 18 commits July 11, 2020 14:01
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Tests still need fixing

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Looks like tm explicitly sets it expecting tablet server to
unset it. We'll need to revisit this.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Also address a few early comments.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
If the requested type is RESTORE, force StateNotConnected.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Some tests became flaky because they expect the state to change
immediately after a call. This change brings the behavior to
be closer to the existing one. But the tabletserver will continue
to retry if an request failed.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Copy link
Copy Markdown
Member

@harshit-gangal harshit-gangal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. Some nits.

Comment on lines +97 to +101
// InitDBConfig initializes the target parameters for the Engine.
func (vse *Engine) InitDBConfig(keyspace string) {
vse.keyspace = keyspace
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this can be renamed to setKeyspace as it is only setting keyspace.

Comment on lines +97 to +101
// InitDBConfig initializes the target parameters for the throttler.
func (t *TxThrottler) InitDBConfig(target querypb.Target) {
t.target = target
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: InitDBConfig -> setTarget

Comment on lines 237 to +241
// Open must be called before sending requests to QueryEngine.
func (qe *QueryEngine) Open() error {
if qe.isOpen {
return nil
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check be not performed under lock or is it protected by higher order?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state manager ensures that Open and Close are not called concurrently. A lock is needed only if the variable is accessed by other functions, which is the case in the other situations.

There are a few places where I do this check inside a lock even though it's not needed. No harm.

Comment on lines +472 to +473
case <-tmr.C:
log.Fatal("Shutdown took too long. Crashing")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test for this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be tested because it will crash the test :).

Copy link
Copy Markdown
Contributor Author

@sougou sougou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The InitDBConfig is a naming convention that implies the completion of an incomplete "New" instantiation. So, giving it other meaningful names will cause this semantic to get lost.

I'm hoping to get rid of this two-phase initialization as part of my refactor. Let's see if I succeed.

Comment on lines 237 to +241
// Open must be called before sending requests to QueryEngine.
func (qe *QueryEngine) Open() error {
if qe.isOpen {
return nil
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state manager ensures that Open and Close are not called concurrently. A lock is needed only if the variable is accessed by other functions, which is the case in the other situations.

There are a few places where I do this check inside a lock even though it's not needed. No harm.

Comment on lines +472 to +473
case <-tmr.C:
log.Fatal("Shutdown took too long. Crashing")
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be tested because it will crash the test :).

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
@sougou sougou merged commit b6c7ffc into vitessio:master Jul 14, 2020
@sougou sougou deleted the ss-ts1-state branch July 14, 2020 02:03
@deepthi deepthi added this to the v7.0 milestone Jul 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants